ABSTRACT

Students in a college introductory statistics class were evaluated
with conceptual and computational tests, and the relationship between
their levels of knowledge on the two forms of testing was assessed.
There were significant correlations between their abilities to
perform computations and to answer more conceptual questions on
individual tests and across separate tests, for the final
examination, and for the total number of points earned throughout
the semester. The correlations indicate that the two styles of
testing provide partially redundant, but not totally overlapping,
information about student knowledge. Further, although student
averages did not differ in the two types of tests, they seldom
preferred only conceptual tests of their knowledge, judging the
computational tests a better means of evaluation. Applications of
these results extend to prediction of missing test scores and to the
testing of students for whom English is a second language.
Student Knowledge of Statistics:
To Know Is to Do
Bernard C. Beins
Ithaca College

Statistics classes are seen as critical by teaching psychologists, as
evidenced by the nearly universal requirement that our majors take such a
course (Bartz, 1981). At the same time, students seem to show a marked
aversion to the class, and instructors have attempted to ameliorate this
situation through a number of techniques (e.g., Beins, 1985; Dillbeck, 1983;
Hastings, 1982; Jacobs, 1980). In a continuing attempt to identify variables
that will lead to a successful course, this paper will report on the relative
efficacy of two testing formats in a statistics course, present an analysis of
student responses to the tests, and suggest some applications of this
knowledge.

Method 

Subjects

Thirty-one students enrolled in an introductory-level statistics class
provided the data for this study. The class consisted largely of students
majoring in social and behavioral sciences or nursing.

Procedure

Students in the class took four hourly tests and a cumulative final exam.
Each test consisted of two portions. The initial, closed-book segment involved
multiple-choice, sentence-completion, and fill-in items, and definitions. The
computational segment took the typical form of statistical tests, with
students selecting statistical tests when given experimental descriptions,
creating graphs and figures, and solving problems; for this part, students
were permitted the use of books, notes, and calculators without built-in
statistical functions (e.g., automatic computation of sums of squares, means,
or variances). The tests were scheduled for a 50-minute class period,
although the students could use the 10-minute interclass interval if they
desired.

Near the end of the term, the students also responded in class to an
anonymous questionnaire about their impressions of the structure of the
course. The questions of relevance here were "Did the open book (closed book)
portions of the tests reflect your knowledge of statistics?" Responses were
on a seven-point scale. On the last day of classes, they were asked whether,
if given a choice, they would prefer their grades to be based on the
conceptual (closed-book) or computational (open-book) test, or on a
combination of the two. I made it clear to them that their course grades
would be based on scores from both kinds of tests. They were asked to write
their names with their replies to this question.

Results & Discussion

Relationship between conceptual and computational tests. There was a marked
association between students' scores on the computational and conceptual
segments of the tests. When the point totals for all quizzes were summed, the
results revealed that students who did well in the computational portions also
did well conceptually, r(29) = 0.63, p < 0.001. Likewise, conceptual and
computational final exam grades were related, r(29) = 0.49, p < 0.01.
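
As a minimal sketch of the statistic reported above, the Pearson correlation
between paired computational and conceptual totals could be computed as
follows. The scores are invented for illustration and are not the class data;
the function name `pearson_r` is likewise hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length score lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length lists with at least 3 pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares about the means.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical proportion-correct totals for six students (not the study data).
computational = [0.55, 0.62, 0.70, 0.74, 0.81, 0.90]
conceptual = [0.50, 0.66, 0.61, 0.78, 0.72, 0.88]

r = pearson_r(computational, conceptual)
df = len(computational) - 2  # degrees of freedom, as in the r(df) notation
print(f"r({df}) = {r:.2f}")
```

The degrees of freedom are the number of paired scores minus two, which is why
a class of 31 students yields r(29).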

In general, the tests showed a considerable degree of intercorrelation.
Pairing each possible set of scores on conceptual and computational segments
for each test yielded 45 possible correlations. A z-test revealed that, of
these, 23 were significant at or beyond the 0.05 alpha level with a two-tailed
test; another two were significant with one tail. With respect to the
conceptual-computational relationships for individual tests, only quiz 1
failed to show a significant correlation. The values appear in Table 1; the
means and standard deviations for each test also appear in the table. The
first test in a given class may be a time of adjustment for many students in
which they acclimate to the nature of the test, so a low correlation is not
totally surprising. The other quizzes seemed to indicate that when students
grasped the concepts, their computations were also performed adequately,
although quiz 3, which produced the highest mean on both conceptual and
computational portions, showed inconsistent correlations with performance on
other tests. Its high means and relatively small dispersion may both have
contributed to its low conceptual-computational correlation, significant
only at p = 0.08.
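
The significance level quoted for quiz 3 can be checked approximately. The
text does not say which z-test was used; the sketch below assumes Fisher's
r-to-z transformation, with n = 31 inferred from the 29 degrees of freedom
reported earlier. The function name `fisher_z_test` is hypothetical.

```python
import math

def fisher_z_test(r, n):
    """Two-tailed test of H0: rho = 0 using Fisher's r-to-z transformation."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # Standard normal CDF evaluated at |z| via the error function.
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    p_two_tailed = 2.0 * (1.0 - phi)
    return z, p_two_tailed

# r = .32 is the quiz 3 conceptual-computational correlation in Table 1;
# n = 31 is an assumption based on the r(29) notation in the text.
z, p = fisher_z_test(0.32, 31)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Under these assumptions the two-tailed probability comes out close to the
p = 0.08 given above.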



Insert Table 1 About Here 



Student responses to the tests. In addition to considerations of the
relationship between conceptual and computational test modes, it might be
useful to know whether student preferences for different kinds of tests are
related to ultimate performance in the class. Consequently, I performed an
analysis of variance to see whether those students who professed a preference
for the open-book format (n = 12) wound up with higher average grades than the
students favoring the combination format (n = 9). There was no difference in
the class averages (in percentages) between students who wanted open-book
tests alone (M = 71.31) and those who would have liked both (M = 68.13),
F(1,19) = 2.11, p = 0.159. Likewise, the performance on the conceptual part
(M = 66.22) and on the computational part (M = 73.21) were not significantly
different, F(1,19) = 2.134, p = 0.157. The interaction was also
non-significant, F < 1.

As a part of a final, anonymous questionnaire (filled out on a day when
28 of the 29 enrollees were present), students were queried about the degree
to which they felt that the open- and closed-book segments tested their
knowledge. Rating each format on a seven-point scale, the students judged the
open-book portions (Mean = 2.71) as being better tests of their knowledge than
closed-book segments (Mean = 3.75), t(27) = 2.78, p < 0.01. Their belief that
the open-book, computational segments were better tests of their knowledge is
reflected only in nonsignificantly higher mean scores on those segments,
t(28) = -1.76, p > 0.05. Scores on the open-book tests were only 3.16 percent
higher than closed-book scores: 70.03 versus 66.77, respectively.
(Obviously, the magnitude of this difference will vary according to the
relative difficulty of the two segments; the important point here is that
one's feelings of competence or one's reaction to the nature of the test may
not adequately reflect actual knowledge or adequacy of the test instrument.)
Thus, the students showed less satisfaction with the conceptual portion of the
test even though their scores were not systematically lower than on the
computational part, and the two were actually related to one another. When I
went through a hypothetical exercise of assigning grades based on the
conceptual and computational scores separately, the difference in grade point
averages for the class was relatively small: 2.07 for conceptual versus 2.24
for computational. This difference reflects a slightly lower percentage
average on the concepts combined with my own standards as to the minimum score
that is appropriate for a particular letter grade.

The results here suggest that a conceptual test of statistics might not
be totally indispensable in assessing student knowledge in the class. The
consistently high correlations between computational and conceptual
information make it seem plausible that students could be given a test whose
format is consistent with their own desires. It should be noted that, simply
because a student selects one format over the other, the resultant grade will
not necessarily be higher than if a different structure were used. To
illustrate this fact, I will point out that of two students who stated a
preference for only closed-book tests, one received a grade of A in the
course, the other an F. At the same time, if an instructor wants as complete
an assessment of the students' knowledge as possible, the two different
formats can be used; the results, while correlated, are not perfectly
overlapping and the two test types are not totally redundant.

The pattern of preferences by students suggests that they probably feel
that closed-book, conceptual tests are harder, as reflected in their belief
that open-book tests assess their knowledge better than closed-book tests.
This inference is based on my assumption that so-called "easier" tests are
viewed more positively because they reinforce the students' desires to feel
competent in statistical procedures.

Applications. There is another reason to consider including both kinds of
tests. The findings here provide a potentially useful means of estimating
missing test scores. A simple linear regression technique could be employed
instead of merely using the student's overall average as the estimate.
According to the present data, the final exam score could serve as a predictor
of a missing test score, as could the total point value accumulated over all
tests or, if the computational and conceptual tests were given on separate
days, one of these two could be used quite adequately to predict the other.
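
The regression approach described above can be sketched as follows. This is
an illustration under stated assumptions, not the procedure used for the
class: the scores are invented, and the names `fit_line` and `predict` are
hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x_new):
    """Predicted y for a new x value on the fitted line."""
    return intercept + slope * x_new

# Hypothetical percentage scores for students who took both tests.
final_exam = [52.0, 60.0, 65.0, 71.0, 78.0, 84.0]
test_2 = [58.0, 61.0, 70.0, 69.0, 80.0, 88.0]

slope, intercept = fit_line(final_exam, test_2)
# A hypothetical student who missed test 2 but scored 68 on the final exam.
estimate = predict(slope, intercept, 68.0)
print(f"estimated test 2 score: {estimate:.1f}")
```

The line is fitted only on students who have both scores; the student with
the missing score contributes nothing to the fit and receives the value the
line predicts from the score that is available.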

Another potential use for this estimation technique involves the testing
of students whose native languages are not English. By the admission of one of
the two students in this class for whom English is not the first language, the
closed-book tests posed difficulty with respect to language considerations.
When predictions of total conceptual scores for both students were made from
the computational scores, the difference between actual and predicted scores
was not significant in either case, z = 0.36 and 0.42, both p's > 0.05. Both
of these students had at least adequate conversational skills in English, so
it is not clear that students with a poorer command of English would be able
to understand the textbook and the problems, even if the skill level is
potentially very high. One reason for optimism here is that neither the
student with self-admitted language problems nor the other student deviated
significantly from their predicted conceptual scores, based on their
computational scores. It may be the case that when a student has a basic
ability to comprehend a textbook, the linguistic factors involved in
computation are not important. An instructor would need to decide whether a
poor grade on the conceptual part of a test was due to language deficiencies
or to lack of statistical knowledge, but if the low mark were due to
linguistic factors, an estimate of the language-based portion of the test
might prove satisfactory. The criterion for using the estimate rather than
the actual score might rest on the instructor's subjective assessment of the
student, or it might be based on systematically lower grades on
language-dependent questions, or on any of a number of different factors. One
caveat, however, is that a poor command of English may ultimately affect
computations as well. Obviously, consideration of mitigating factors will be
required in adoption of any of these suggestions.
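
The comparison of actual and predicted conceptual scores described above can
be sketched as follows. How the reported z values were computed is not
stated; this sketch assumes the difference between actual and predicted
scores is divided by the standard error of estimate from the class
regression, which is a simplification. The scores are invented and the
function name `standardized_residual` is hypothetical.

```python
import math

def standardized_residual(x, y, x_new, y_actual):
    """Return (actual - predicted) / standard error of estimate."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length lists with at least 3 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    # Standard error of estimate: residual variability about the fitted line.
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    se_estimate = math.sqrt(ss_resid / (n - 2))
    predicted = intercept + slope * x_new
    return (y_actual - predicted) / se_estimate

# Hypothetical percentage totals for the rest of the class (not the study data).
comp_totals = [55.0, 62.0, 70.0, 74.0, 81.0, 90.0]
concept_totals = [50.0, 66.0, 61.0, 78.0, 72.0, 88.0]

# Hypothetical student: computational total 76, actual conceptual total 70.
z = standardized_residual(comp_totals, concept_totals, 76.0, 70.0)
flagged = abs(z) > 1.96  # two-tailed 0.05 cutoff for a standard normal deviate
print(f"z = {z:.2f}, flagged = {flagged}")
```

A conceptual score would be flagged as discrepant only when it falls well
below (or above) what the computational score predicts, relative to the
scatter seen in the rest of the class.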
Table 1

[...] tests and final exam. (Means and standard deviations are in [...])

               Computational                 Conceptual
Test^a         1     2     3     4     Final 1     2     3     4     Final Mean  S.D.

Comput.  1     --    .20   .53*  .15   .28   .12   .4?*  .26   .1?   .20   .69   .20
         2           --    .44*  .20   .52*  .26   .54*  .10   .26   .41*  .76   .15
         3                 --    .26   .45*  .45*  .32   .32   .55*  .46*  .86   .14
         4                       --    .45*  .44*  .48*  .22   .37*  .42*  .58   .22
         Final                         --    .20   .51*  .32   .45*  .49*  .61   .12

Concept. 1                                   --    .39*  .16   .62*  .24   .80   .20
         2                                         --    .22   .51*  .54*  .73   .20
         3                                               --    .42*  .13   .85   .15
         4                                                     --    .31   .73   .19
         Final                                                       --    .62   .19

*p < 0.05

^a Test topics are as follows:

Test 1:  Graphing; correlation/regression
Test 2:  Normal distribution; sampling distributions
Test 3:  Statistical inference; one-sample z-test and t-test
Test 4:  Two-sample t-test; analysis of variance
Final:   Cumulative test